Add support for block pruning #116
This change is pretty scary. It seems like pruning itself is done, but I don't think the way it works now is safe for the network.

We now have a state where users might prune blocks without the syncer notifying other peers that they are no longer a full node. So imo this is only safe to use once we have a solution for that.

Imo that involves signaling whether you are a full node and what your latest known block is, and then having the syncer only connect to peers that are either full nodes or "partial" nodes at or above your required block height. Otherwise we might run into weird issues, and in the worst case network fragmentation.
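To make that concrete, here's a rough sketch of what such a peer filter could look like. Every name here is invented for illustration; the actual handshake fields and syncer API would still need to be designed:

```go
package syncer // hypothetical package, for illustration only

// PeerInfo is an invented struct: what a peer might advertise during the
// handshake once pruning status is signaled. Nothing like it exists yet.
type PeerInfo struct {
	FullNode bool   // peer claims to store the entire chain
	Height   uint64 // latest block height the peer knows about
}

// Acceptable reports whether we should sync from this peer, given the
// block height we require.
func Acceptable(p PeerInfo, required uint64) bool {
	return p.FullNode || p.Height >= required
}
```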
We should definitely test what happens when you try to sync from a pruned node.
Will the pruned peer ever be dinged enough to be banned? That could cause problems if the pruned node ends up banned by all the full nodes over time for errored RPCs 🤔
Been a long time coming. 😅
The strategy here is quite naive, but I think it will be serviceable. Basically, when we apply block `N`, we delete block `N-P`. `P` is therefore the "prune target," i.e. the maximum number of blocks you want to store.

In practice, this isn't exhaustive: it only deletes blocks from the best chain. It also won't dramatically shrink the size of an existing database. I think this is acceptable, because pruning is most important during the initial sync, and during the initial sync, you'll only be receiving blocks from one chain at a time. Also, we don't want to make pruning too easy; after all, we need a good percentage of nodes to be storing the full chain, so that others can sync to them.
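For illustration, the apply-time rule boils down to something like the following sketch. The `Store` interface and all names here are invented stand-ins, not the actual coreutils API:

```go
package prune // hypothetical package, for illustration only

// Store is a stand-in for the consensus database; the real interface differs.
type Store interface {
	AddBlock(height uint64, b []byte) error
	DeleteBlock(height uint64) error
}

const pruneTarget = 1000 // P: the maximum number of blocks to retain

// ApplyAndPrune applies block N, then deletes block N-P from the best
// chain, mirroring the naive strategy described above.
func ApplyAndPrune(s Store, height uint64, b []byte) error {
	if err := s.AddBlock(height, b); err != nil {
		return err
	}
	if height >= pruneTarget {
		return s.DeleteBlock(height - pruneTarget)
	}
	return nil
}
```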
I tested this out locally with a prune target of 1000, and after syncing 400,000 blocks, my `consensus.db` was around 18 GB. This is disappointing; it should be much smaller. With some investigation, I found that the Bolt database was only storing ~5 GB of data (most of which was the accumulator tree, which we can't prune until after v2). I think this is a combination of a) Bolt grows the DB capacity aggressively in response to writes, and b) Bolt never shrinks the DB capacity. So it's possible that we could reduce this number by tweaking our DB batching parameters. Alternatively, we could provide a tool that copies the DB to a new file. Not the most user-friendly, but again, I think I'm okay with that for now.

Depends on SiaFoundation/core#228
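For what it's worth, the copy-to-a-new-file tool could be a thin wrapper around bbolt's `Compact` helper. A minimal sketch, assuming the DB is plain bbolt and using placeholder paths:

```go
package main

import (
	"log"

	bolt "go.etcd.io/bbolt"
)

func main() {
	// Open the existing (oversized) database read-only.
	src, err := bolt.Open("consensus.db", 0600, &bolt.Options{ReadOnly: true})
	if err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	// The destination file starts empty, so its capacity will match the
	// actual data size rather than the high-water mark of the old file.
	dst, err := bolt.Open("consensus-compacted.db", 0600, nil)
	if err != nil {
		log.Fatal(err)
	}
	defer dst.Close()

	// Compact copies every key/value pair into dst; 64 MiB per transaction
	// is an arbitrary batch size.
	if err := bolt.Compact(dst, src, 64<<20); err != nil {
		log.Fatal(err)
	}
}
```

The user would then swap the compacted file into place while the node is stopped.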